A Generalized Basic Cycle Calculation Method for Efficient Array Redistribution
Abstract
In many scientific applications, dynamic array redistribution is usually required to enhance the performance of an algorithm. In this paper, we present a generalized basic-cycle calculation (GBCC) method to efficiently perform a BLOCK-CYCLIC(s) over P processors to BLOCK-CYCLIC(t) over Q processors array redistribution. In the GBCC method, a processor first computes the source/destination processor/data sets of array elements in the first generalized basic-cycle of the local array it owns. A generalized basic-cycle is defined as $\mathrm{lcm}(sP, tQ) / (\gcd(s, t) \times P)$ in the source distribution and $\mathrm{lcm}(sP, tQ) / (\gcd(s, t) \times Q)$ in the destination distribution. From the source/destination processor/data sets of array elements in the first generalized basic-cycle, we can construct packing/unpacking pattern tables to minimize the data-movement operations. Since each generalized basic-cycle has the same communication pattern, based on the packing/unpacking pattern tables, a processor can pack/unpack array elements efficiently. To evaluate the performance of the GBCC method, we have implemented this method on an IBM SP2 parallel machine, along with the PITFALLS method and the ScaLAPACK method. The cost models for these three methods are also presented. The experimental results show that the GBCC method outperforms the PITFALLS method and the ScaLAPACK method for all test samples. A brief description of the extension of the GBCC method to multidimensional array redistributions is also presented.
Index Terms: Redistribution, generalized basic-cycle calculation method, distributed memory multicomputers.
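The generalized basic-cycle sizes defined in the abstract, and the claim that every cycle shares one communication pattern, can be illustrated with a small sketch. This is not the paper's implementation: the function names (`owner`, `gbc_sizes`, `dest_pattern`) are hypothetical, global indices are assumed to be 0-based, and the cycle size is read as a count of chunks of gcd(s, t) consecutive local elements.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def owner(i, block, nprocs):
    """Processor owning global index i (0-based) under BLOCK-CYCLIC(block)."""
    return (i // block) % nprocs


def gbc_sizes(s, P, t, Q):
    """Generalized basic-cycle sizes (source, destination) per the abstract's formula."""
    period = lcm(s * P, t * Q)  # global period after which both layouts repeat
    g = gcd(s, t)
    return period // (g * P), period // (g * Q)


def dest_pattern(src_proc, s, P, t, Q, n_global):
    """Destination processor of each local element of src_proc, in local order."""
    return [owner(i, t, Q) for i in range(n_global) if owner(i, s, P) == src_proc]


# Illustrative parameters (an assumption, not taken from the paper).
s, P, t, Q = 2, 2, 4, 3
src_gbc, dst_gbc = gbc_sizes(s, P, t, Q)
local_period = src_gbc * gcd(s, t)  # local elements covered by one source cycle
pattern = dest_pattern(0, s, P, t, Q, 3 * lcm(s * P, t * Q))
repeats = all(
    pattern[k] == pattern[k + local_period]
    for k in range(len(pattern) - local_period)
)
```

Under these assumptions, only the first `local_period` entries of `pattern` need to be computed; the rest of the local array reuses them, which is the property the packing/unpacking pattern tables rely on.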
Similar resources
A Basic-Cycle Calculation Technique for Efficient Dynamic Data Redistribution
Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. Since it is performed at run-time, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present a basic-cycle calcu...
Packing/Unpacking Information Generation for Efficient Generalized kr→r and r→kr Array Redistribution
Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. Since it is performed at run-time, there is a performance tradeoff between the efficiency of new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present efficient methods to gen...
Modeling nanoscale V-shaped antennas for the design of optical phased arrays
We present a simplified numerical method to solve for the current distribution in a V-shaped antenna excited by an electric field with arbitrary polarization. The scattered far-field amplitude, phase, and polarization of the antennas are extracted. The calculation technique presented here is an efficient method for probing the large design parameter space of such antennas, which have been propo...
An Efficient Implementation of Phase Field Method with Explicit Time Integration
The phase field method integrates the Griffith theory and damage mechanics approach to predict crack initiation, propagation, and branching within one framework. No crack tracking topology is needed, and complex crack shapes can be captured without user intervention. In this paper, a detailed description of how the phase field method is implemented with explicit dynamics into LS-DYNA is provide...
A Generalized Processor Mapping Technique for Array Redistribution
In many scientific applications, array redistribution is usually required to enhance data locality and reduce remote memory access in many parallel programs on distributed memory multicomputers. Since the redistribution is performed at runtime, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistribu...